[fix](be) Avoid local runtime filter merge deadlock#64866
Conversation
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Local runtime filter merge can deadlock when one join build instance publishes a local-merge runtime filter while another instance sends its runtime filter size. The old local merge context lock protected both the merger and the producer list, so one path could hold a producer runtime filter lock and then wait for the context lock while another path held the context lock and then waited for a producer lock.
This change gives RuntimeFilterMerger its own internal synchronization and makes LocalMergeContext expose a snapshot of the merger and producers. Publish, send-size, and sync-size paths take the context lock only while copying that snapshot, then merge filters or update producer sizes outside the context lock. RuntimeFilterMerger returns the ready transition from merge_from directly, removing the separate unlocked ready check.
### Release note
None
### Check List (For Author)
- Test: Unit Test
- build-support/clang-format.sh be/src/exec/runtime_filter/runtime_filter_merger.h be/src/exec/runtime_filter/runtime_filter_mgr.cpp be/src/exec/runtime_filter/runtime_filter_mgr.h be/src/exec/runtime_filter/runtime_filter_producer.cpp be/test/exec/runtime_filter/runtime_filter_merger_test.cpp be/test/exec/runtime_filter/runtime_filter_mgr_test.cpp
- git diff --cached --check
- ./run-be-ut.sh --run --filter=RuntimeFilterMgrTest.*
- ./run-be-ut.sh --run --filter=RuntimeFilterMergerTest.*
- Behavior changed: No
- Does this need documentation: No
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
There was a problem hiding this comment.
Automated review summary for PR #64866
I reviewed the runtime-filter merger/producer/manager changes and the adjacent runtime-filter tests. I did not find a substantiated issue that should be raised as an inline review comment.
Critical checkpoints:
- Goal/test: the change targets the local runtime-filter merge deadlock by moving producer/merger work out of
LocalMergeContextlocking and synchronizingRuntimeFilterMergerinternally. The updated BE UTs cover the adjusted merger readiness and manager API behavior. I could not run BE UTs locally becausethirdparty/installedis missing in this checkout. - Scope/clarity: the patch is focused on runtime-filter locking, snapshots, and related unit-test updates.
- Concurrency/lifecycle: I checked producer
publish()/send_size(), local snapshots, merger locking, global merge reset, and recursive CTE reset/rebuild sequencing. I did not find a new lock-order cycle or a substantiated stale-stage publication issue. - Config/compatibility: no new config was added; the runtime-filter stage fields used on RPCs are existing optional protobuf fields.
- Parallel paths: local-only, local-merge, remote merge, size-sync, stale recursive-CTE RPC handling, and local consumer signaling paths were reviewed.
- Tests/style:
git diff --checkpassed, andbuild-support/check-format.shpassed whenclang-format16 was selected inPATH. BE UT execution was not possible locally because the checkout lacksthirdparty/installed.
Subagent conclusions:
optimizer-rewriteproposed OPT-1 about stale recursive-CTE local publication. I dismissed it with lifecycle evidence:WAIT_FOR_DESTROYcompletes old PFC teardown beforeREBUILDregisters new-stage consumers/producers, while remote stale messages are stage-checked.tests-session-configreported no candidates.- Convergence round 1 ended with both live subagents reporting
NO_NEW_VALUABLE_FINDINGSfor the final no-inline-comment set.
User focus: no additional user-provided review focus was supplied.
TPC-H: Total hot run time: 28845 ms |
TPC-DS: Total hot run time: 173691 ms |
ClickBench: Total hot run time: 25.24 s |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
|
/review |
There was a problem hiding this comment.
Automated review summary for PR #64866.
I reviewed the six changed runtime-filter files and the relevant producer/helper, manager, pipeline prepare/submit, recursive CTE rerun, and sync/publish paths. I did not find a substantiated issue requiring an inline review comment.
Critical checkpoint conclusions:
- Goal: the PR targets a local runtime filter merge deadlock by moving merge/size work out from the old context lock and giving RuntimeFilterMerger its own synchronization. The changed
merge_from(..., &ready), size aggregation, shared context lookup, and producer publish/send-size call sites are consistent with that goal. - Scope and focus: the change is confined to BE runtime-filter manager/producer/merger code and matching BE unit-test API updates.
- Concurrency/lifecycle: I specifically checked producer registration, send-size, publish, global merge, and recursive CTE stage reset ordering. Producer registration happens during task prepare before fragment submit; parallel prepare waits for all instances; recursive CTE runs WAIT_FOR_DESTROY -> reset_global_rf -> REBUILD/prepare -> SUBMIT; stale-stage requests are filtered through stage checks. I did not find an executable same-stage registration/read race in the reviewed paths.
- Configuration, protocol, storage, persistence: no new config, thrift/protobuf field, storage format, or persistence behavior is introduced.
- Parallel paths: local merge publish, local size sync, and global merge-controller merge paths were updated to use the new ready transition. I did not find a missed parallel runtime-filter path in the changed surface.
- Tests: the PR updates BE unit tests for the new
merge_fromready result and local merge context API. I could not run BE UT in this checkout becausethirdparty/installedis absent. I did rungit diff --checkon the exact PR range, and it passed. - Observability/security: no security-sensitive path is changed; debug output is preserved through the new
LocalMergeContext::debug_string()path. - User focus: no additional user-provided review focus was supplied.
Subagent conclusions:
optimizer-rewrite: no optimizer/rewrite or runtime-filter semantic regression found.tests-session-config: raised TEST-1 about unsynchronizedLocalMergeContext::producersreads. I dismissed it with lifecycle evidence above; it remains a useful concurrent-regression test gap but not a substantiated inline issue.- Convergence round 1 ended with both live subagents reporting
NO_NEW_VALUABLE_FINDINGSfor the same ledger and empty proposed inline comment set.
TPC-H: Total hot run time: 29309 ms |
TPC-DS: Total hot run time: 171988 ms |
ClickBench: Total hot run time: 25.22 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
TPC-H: Total hot run time: 29536 ms |
TPC-DS: Total hot run time: 171619 ms |
ClickBench: Total hot run time: 25.3 s |
|
/review |
There was a problem hiding this comment.
Automated review summary for PR #64866.
I found one issue: the changed unit tests do not cover the nonzero recursive-stage local merge path that the production change now relies on for context replacement and stale-stage filtering.
Critical checkpoint conclusions:
- Goal/test: the production change addresses local runtime-filter merge synchronization and recursive-stage stale filtering. Merger readiness, manager API renaming, and default stage-0 behavior are covered, but nonzero-stage local merge replacement/stale filtering is not covered.
- Scope/clarity: the patch is focused on BE runtime-filter manager/producer/merger code and matching BE unit-test API updates.
- Concurrency/lifecycle: I checked local merge registration, publish, send-size, sync-size, global merge reset, PipelineFragmentContext prepare/submit ordering, and recursive CTE WAIT_FOR_DESTROY -> reset_global_rf -> REBUILD -> SUBMIT ordering. The producer-vector synchronization notes overlap the existing inline thread and are not submitted again.
- Config/compatibility/persistence/storage: no new config, storage format, persistence, or FE-BE protocol field is introduced by this PR.
- Parallel paths: local merge publish, local size sync, global merge, stale-stage RPC checks, hash join and set sink runtime-filter paths were reviewed.
- Tests/style: exact-range
git diff --checkpassed locally. I did not run BE UTs because this checkout lacksthirdparty/installed; local clang-format validation was also not run because only clang-format 18 is available and Doris requires v16. Current GitHub status contexts show BE UT and compile success on the head; the failed macOS BE UT check exits early because the runner has Java 25 instead of JDK 17. - Observability/security: no security-sensitive behavior is changed; debug output is preserved through
LocalMergeContext::debug_string().
Subagent conclusions:
optimizer-rewrite: OPT-1 was treated as duplicate of the existing producer-vector synchronization thread; convergence round 1 returnedNO_NEW_VALUABLE_FINDINGS.tests-session-config: TSC-1 was merged as the same duplicate synchronization point; TSC-2 became the single inline test-coverage comment; convergence round 1 returnedNO_NEW_VALUABLE_FINDINGS.
User focus: no additional user-provided review focus was supplied.
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Local runtime filter merge context replacement used a reference to the map entry, making it unclear whether a recursive CTE stage reset replaced the shared context or mutated a context visible through existing shared_ptr owners. This change explicitly inserts or assigns a new LocalMergeContext when a newer stage registers, and removes the extra LocalMergeContext debug string snapshot so debug output stays close to the existing manager-level format.
### Release note
None
### Check List (For Author)
- Test: Manual test
- python3 build-support/run_clang_format.py --clang-format-executable /mnt/disk6/common/ldb_toolchain_toucan/bin/clang-format --style file --inplace true be/src/exec/runtime_filter/runtime_filter_mgr.cpp be/src/exec/runtime_filter/runtime_filter_mgr.h
- git diff --check -- be/src/exec/runtime_filter/runtime_filter_mgr.cpp be/src/exec/runtime_filter/runtime_filter_mgr.h
- Behavior changed: No
- Does this need documentation: No
|
run buildall |
|
/review |
There was a problem hiding this comment.
Automated review result: found one additional blocking issue.
Critical checkpoints:
- Goal/test: the PR addresses local runtime-filter merge/context handling; current tests cover basic manager and merger paths, while the missing nonzero recursive-stage test is already covered by existing thread discussion_r3487509240.
- Concurrency/lifecycle: prepare-before-submit and recursive CTE WAIT_FOR_DESTROY -> REBUILD -> SUBMIT ordering make the producer-vector race unsubstantiated, but the current debug path still has a concrete manager-lock / producer-lock inversion; see inline comment.
- Config/protocol/compatibility: no new config item or mixed-version protocol issue found; stage propagation through merge/size/publish paths was reviewed.
- Parallel paths: local/global merge publish and size-sync paths were reviewed; no additional semantic issue found beyond the inline comment and existing test-coverage thread.
- Tests/validation: exact-range git diff --check passed. Targeted BE UT execution did not reach the tests because run-be-ut failed during gensrc setup: thirdparty/installed/bin/protoc is missing. GitHub formatter checks are green in the current PR status.
- User focus: no additional user-provided focus was present.
Subagent conclusions: optimizer-rewrite candidate OPT-1 was verified and became MAIN-1, the single inline comment. tests-session-config reported no new valuable findings. The existing nonzero-stage coverage concern was treated as duplicate of discussion_r3487509240, and the producer-vector race was dismissed with lifecycle evidence. Convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same current ledger/comment set.
|
PR approved by at least one committer and no changes requested. |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 29344 ms |
TPC-DS: Total hot run time: 172292 ms |
ClickBench: Total hot run time: 25.16 s |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found one compile-blocking issue in the changed runtime-filter unit tests. The production API was renamed to increase_expected_producer_num(...), but RuntimeFilterMergerTest still calls the removed set_expected_producer_num(...) helper in four places.
Checkpoint conclusions: the PR goal is clear and the production change is focused on runtime-filter local/global merge locking and recursive-stage filtering; the changed code generally follows the intended lock-order direction by avoiding manager-lock calls into runtime-filter objects. The main remaining blocker is test/build correctness. Concurrency/lifecycle around recursive CTE stage reset was reviewed, including WAIT_FOR_DESTROY -> reset_global_rf -> REBUILD -> SUBMIT and the deferred close guard; I did not find a new substantiated runtime lifecycle issue beyond the existing open stage-coverage and debug-lock comments. No new config, persistence, storage format, FE/BE protocol field, or observability requirement was introduced beyond the stage fields already being propagated in runtime-filter RPCs. Test coverage is still partially covered by existing review thread discussion_r3487509240 for nonzero-stage/stale-stage behavior, and this review adds the separate compile blocker below.
Validation: git diff --check passed for the changed file set. Local BE UT execution was not available because this checkout is missing thirdparty/installed. The current macOS BE UT GitHub check failed before compilation due runner Java 25 vs required JDK 17, so it did not independently validate this compile issue.
User focus: no additional user-provided review focus was supplied.
Subagent conclusions: tests-session-config reported TSC-001, accepted here as the inline comment, and TSC-DUP-001, dismissed as duplicate of existing thread discussion_r3487509240. optimizer-rewrite independently reported the same stale test API issue and then marked it duplicate of MAIN-001; it found no additional optimizer/rewrite semantic issue. Convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same current ledger/comment set.
| @@ -66,12 +69,14 @@ class RuntimeFilterMergerTest : public RuntimeFilterTest { | |||
| merger->set_expected_producer_num(1); | |||
There was a problem hiding this comment.
This test file still calls the old RuntimeFilterMerger::set_expected_producer_num(...) helper, but the PR removed that member and replaced the production users with increase_expected_producer_num(...). The only remaining set_expected_producer_num references are the four calls in this file, so RuntimeFilterMergerTest will fail to compile once the BE UT target is built. Please update these calls to the new helper name, or keep a compatibility wrapper if the old API is still intended.
TPC-H: Total hot run time: 29797 ms |
TPC-DS: Total hot run time: 171230 ms |
ClickBench: Total hot run time: 25.31 s |
|
run buildall |
TPC-H: Total hot run time: 29970 ms |
TPC-DS: Total hot run time: 172400 ms |
ClickBench: Total hot run time: 25.05 s |
What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Local runtime filter merge can deadlock when one join build instance publishes a local-merge runtime filter while another instance sends its runtime filter size. The old local merge context lock protected both the merger and the producer list, so one path could hold a producer runtime filter lock and then wait for the context lock while another path held the context lock and then waited for a producer lock.
This change gives RuntimeFilterMerger its own internal synchronization and makes LocalMergeContext expose a snapshot of the merger and producers. Publish, send-size, and sync-size paths take the context lock only while copying that snapshot, then merge filters or update producer sizes outside the context lock. RuntimeFilterMerger returns the ready transition from merge_from directly, removing the separate unlocked ready check.
Release note
None
Check List (For Author)